MATH333: Assessment Week 19

Assessment Deadline: Week 20, Tuesday 5pm.

Consider the titanic data set that was introduced in the last assessment. Here, the event that the ith passenger survived the sinking of the RMS Titanic is assumed to follow a Bernoulli distribution with some unknown probability μi. The unknown probability is related to the linear predictor ηi via the logit transform, i.e. logit(μi)=ηi. The data set titanic contains three explanatory variables:

  • pclass – passenger class, either 1st, 2nd or 3rd.

  • sex – the gender of the passenger, either male or female.

  • age – the age of the passenger in years.
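
The model described above can be written out explicitly. In the sketch below, Y_i is a symbol introduced here for illustration (it does not appear in the sheet) and denotes the survival indicator of passenger i; the inverse of the logit transform is shown alongside it.

```latex
Y_i \sim \text{Bernoulli}(\mu_i), \qquad
\operatorname{logit}(\mu_i) = \log\!\left(\frac{\mu_i}{1-\mu_i}\right) = \eta_i
\quad\Longleftrightarrow\quad
\mu_i = \frac{\exp(\eta_i)}{1+\exp(\eta_i)}.
```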

  1. Let the indicator variable siM be 1 if passenger i is male or 0 otherwise, and the indicator variable siF be 1 if passenger i is female or 0 otherwise. An analysis proposes the following linear predictor:

    ηi=β0+β1siM+β2siF.

    Is this a good linear predictor? Justify your answer. [1]

  2. State the linear predictor that contains age and passenger class. Explain any notation that you introduce.

    [Hint: To get started, let 𝐚 denote the vector of ages, with the ith entry corresponding to the age of the ith person in the data set, and let 𝐜1st be an indicator vector that is 1 in the ith position if the ith passenger is in 1st class and 0 otherwise. Construct other vectors as required, defining them carefully, and use these to formulate the linear predictor.]

    [2]

  3. State the linear predictor for age and passenger class with an interaction term between these two explanatory variables. [1]

  4. Explain what effect including an age and class interaction term has on the linear predictor. [1]

  5. Using R, fit a generalised linear model to the Titanic survivor data where the linear predictor contains passenger class, the gender of the passenger and the interaction between these explanatory variables. Write down the R command you used to fit this model. [1]

  6. Estimate the probability that a female 1st class passenger survived the sinking of the Titanic, and calculate a 95% confidence interval for this probability. [2]

  7. Estimate the probability that a male 2nd class passenger survived the sinking of the Titanic. [2]
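
Questions 6 and 7 ask for estimates on the probability scale, whereas the fitted model works on the linear-predictor scale. One common approach, given here as a sketch and not necessarily the method the assessment expects, is to form a Wald interval for the linear predictor and map both endpoints through the inverse logit. The symbols η̂ (the estimated linear predictor for the passenger group of interest) and se(η̂) (its standard error) are introduced here for illustration.

```latex
\hat{\mu} = \frac{\exp(\hat{\eta})}{1+\exp(\hat{\eta})}, \qquad
\left[\,
\frac{\exp\!\big(\hat{\eta} - 1.96\,\mathrm{se}(\hat{\eta})\big)}{1+\exp\!\big(\hat{\eta} - 1.96\,\mathrm{se}(\hat{\eta})\big)},\;
\frac{\exp\!\big(\hat{\eta} + 1.96\,\mathrm{se}(\hat{\eta})\big)}{1+\exp\!\big(\hat{\eta} + 1.96\,\mathrm{se}(\hat{\eta})\big)}
\,\right].
```

Because the inverse logit is increasing, the transformed endpoints stay in order and always lie between 0 and 1.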